Fast Parallel Association Rule Mining without Candidacy Generation
Abstract
In this paper we introduce MLFPT (Multiple Local Frequent Pattern Tree) [11], a new parallel algorithm for mining frequent patterns based on FP-growth mining. It uses only two full I/O scans of the database, eliminates the need to generate candidate items, and distributes the work fairly among processors. We have devised partitioning strategies at different stages of the mining process to achieve near-optimal load balancing between processors. We have successfully tested our algorithm on datasets larger than 50 million transactions.
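The two-scan idea behind FP-growth style mining can be illustrated with a minimal sketch. It is not the MLFPT implementation: the names `Node`, `count_items`, `build_local_tree` and `build_local_trees` are hypothetical, the processors are simulated sequentially, and the round-robin split stands in for the paper's partitioning strategies, which the abstract does not describe. The first scan counts item supports per partition and sums them; the second scan inserts each transaction's frequent items, in a shared global order, into one local tree per partition.

```python
from collections import Counter


class Node:
    """One node of a local frequent pattern tree (hypothetical structure)."""

    def __init__(self, item, parent=None):
        self.item = item
        self.count = 0
        self.parent = parent
        self.children = {}


def count_items(partitions):
    """Scan 1: count item supports per partition, then sum the local counts."""
    total = Counter()
    for part in partitions:
        local = Counter(item for t in part for item in set(t))
        total.update(local)
    return total


def build_local_tree(partition, order):
    """Scan 2: insert frequent items of each transaction in global order."""
    root = Node(None)
    for t in partition:
        items = sorted((i for i in set(t) if i in order), key=order.__getitem__)
        node = root
        for item in items:
            child = node.children.get(item)
            if child is None:
                child = Node(item, node)
                node.children[item] = child
            child.count += 1
            node = child
    return root


def build_local_trees(transactions, n_workers, min_support):
    # Assumption: a simple round-robin split, one partition per simulated worker.
    partitions = [transactions[i::n_workers] for i in range(n_workers)]
    counts = count_items(partitions)
    # Frequent items sorted by descending support, ties broken by item name.
    frequent = sorted(
        (i for i, c in counts.items() if c >= min_support),
        key=lambda i: (-counts[i], i),
    )
    order = {item: rank for rank, item in enumerate(frequent)}
    trees = [build_local_tree(p, order) for p in partitions]
    return counts, order, trees


if __name__ == "__main__":
    data = [["a", "b", "c"], ["a", "b"], ["a", "c"], ["b", "d"], ["a"]]
    counts, order, trees = build_local_trees(data, n_workers=2, min_support=2)
    print(dict(counts), list(order), len(trees))
```

No candidate itemsets are generated at any point: the database is read once to obtain supports and once to build the trees, and the mining phase (not shown here) would work on the trees alone.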
Similar resources
Parallel Association Rule Mining with Minimum Inter-Processor Communication
Existing parallel association rule mining algorithms suffer from many problems when mining massive transactional datasets. One major problem is that most parallel algorithms for a shared-nothing environment are Apriori-based. Apriori-based algorithms have been shown not to scale for many reasons, mainly: (1) the repetitive I/O disk scans, (2) the huge computation and commun...
COFI-tree Mining: A New Approach to Pattern Growth with Reduced Candidacy Generation
Existing association rule mining algorithms suffer from many problems when mining massive transactional datasets. Some of these major problems are: (1) the repetitive I/O disk scans, (2) the huge computation involved during the candidacy generation, and (3) the high memory dependency. This paper presents the implementation of our frequent itemset mining algorithm, COFI, which achieves its effic...
Multi-objective Genetic Algorithm for Association Rule Mining Using a Homogeneous Dedicated Cluster of Workstations
This study presents a fast and scalable multi-objective association rule mining technique that applies a genetic algorithm to large databases. Objective functions such as confidence factor, comprehensibility and interestingness can be thought of as different objectives of our association rule mining problem and are treated as the basic input to the genetic algorithm. The outcomes of our algorithm are...
Effective Positive Negative Association Rule Mining Using Improved Frequent Pattern Tree
Association rules are an important tool in today's data mining techniques, but work so far has been concerned only with positive rule generation. This paper studies the generation of both negative and positive rules, as modern data mining techniques require. It also gives details of “A method for generating all positive and negative Association Rules” (PNAR). PNAR helps to generate...
Effective Positive Negative Association Rule Mining Using Improved Frequent Pattern
Association rules are an important tool in today's data mining techniques, but work so far has been concerned only with positive rule generation. This paper studies the generation of both negative and positive rules, as modern data mining techniques require. It also gives details of “A method for generating all positive and negative Association Rules” (PNAR). PNAR helps to generate...
Publication date: 2001